Weakly Supervised Learning for Structured Output Prediction

Author

  • M. Pawan Kumar
Abstract

We consider the problem of learning the parameters of a structured output prediction model, that is, learning to predict elements of a complex, interdependent output space that correspond to a given input. Unlike many existing approaches, we focus on the weakly supervised setting, where most (or all) of the training samples have been only partially annotated. Given such a weakly supervised dataset, our goal is to estimate accurate model parameters by minimizing the regularized empirical risk, where the risk is measured by a user-specified loss function. This task has previously been addressed by the well-known latent support vector machine (latent SVM) framework. We argue that, while latent SVM offers a computationally efficient solution to loss-based weakly supervised learning, it suffers from three drawbacks: (i) the optimization problem corresponding to latent SVM is a difference-of-convex program, which is nonconvex and hence susceptible to poor local minima; (ii) the prediction rule of latent SVM relies only on the most likely value of the latent variables, and not on the uncertainty in the latent variable values; and (iii) the loss function used to measure the risk is restricted to be independent of the true (unknown) values of the latent variables. We address these drawbacks with three novel contributions. First, inspired by human learning, we design an automatic self-paced learning algorithm for latent SVM, which builds on the intuition that the learner should be presented with the training samples in a meaningful order that facilitates learning: starting from easy samples and gradually moving to harder ones. Our algorithm simultaneously selects the easy samples and updates the parameters at each iteration by solving a biconvex optimization problem. Second, we propose a new family of latent variable models called max-margin min-entropy (M3E) models, which includes latent SVM as a special case. Given an input, an M3E model predicts the output with the smallest corresponding Rényi entropy of the generalized distribution, which depends not only on the probability of the output but also on the uncertainty of the latent variable values. Third, we propose a novel framework for learning with general loss functions that may depend on the latent variables. Specifically, our framework simultaneously estimates two distributions: (i) a conditional distribution that models the uncertainty of the latent variables for a given input-output pair; and (ii) a delta distribution that predicts the output and the latent variables for a given input. During learning, we encourage agreement between the two distributions by minimizing a loss-based dissimilarity coefficient. We demonstrate the efficacy of our contributions on standard machine learning applications using publicly available datasets.
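
The self-paced idea described in the abstract (alternate between selecting easy samples and updating the parameters, then gradually admit harder samples) can be illustrated with a minimal sketch. This is not the algorithm from the work itself: it assumes a plain linear classifier with hinge loss and no latent variables, and the function names, the median-based starting threshold, and the growth factor of 1.3 are all hypothetical choices made for illustration.

```python
import statistics


def hinge_losses(w, xs, ys):
    """Per-sample hinge loss max(0, 1 - y * <w, x>) for a linear model."""
    return [max(0.0, 1.0 - y * sum(wi * xi for wi, xi in zip(w, x)))
            for x, y in zip(xs, ys)]


def select_easy(losses, threshold):
    """With w fixed, mark a sample as easy (1) iff its loss is below threshold."""
    return [1 if loss < threshold else 0 for loss in losses]


def update_weights(w, xs, ys, v, reg=0.01, lr=0.1, steps=50):
    """With v fixed, run subgradient descent on the selected samples only."""
    w = list(w)
    n_sel = max(1, sum(v))
    for _ in range(steps):
        grad = [reg * wi for wi in w]  # gradient of the L2 regularizer
        for x, y, vi in zip(xs, ys, v):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if vi and margin < 1.0:  # hinge is active for this selected sample
                for j, xj in enumerate(x):
                    grad[j] -= y * xj / n_sel
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def self_paced_fit(xs, ys, growth=1.3, rounds=8):
    """Alternate easy-sample selection and weight updates (illustrative only)."""
    n = len(xs)
    # Assumed warm start: a few steps on all samples so the losses differ.
    w = update_weights([0.0] * len(xs[0]), xs, ys, [1] * n, steps=5)
    # Assumed starting threshold: the median loss after the warm start.
    threshold = statistics.median(hinge_losses(w, xs, ys))
    history = []
    for _ in range(rounds):
        v = select_easy(hinge_losses(w, xs, ys), threshold)
        w = update_weights(w, xs, ys, v)
        history.append(sum(v))  # number of samples used in this round
        threshold *= growth     # raise the threshold to admit harder samples
    return w, history


if __name__ == "__main__":
    # Toy 1-D data: four consistent points plus one mislabeled outlier.
    xs = [[2.0], [1.5], [-2.0], [-1.5], [-3.0]]
    ys = [1, 1, -1, -1, 1]
    weights, used = self_paced_fit(xs, ys)
    print(weights, used)
```

For a fixed `w`, choosing each indicator by thresholding the loss is the exact minimizer of an objective of the form "selected losses minus threshold times the number of selected samples", which is why the alternation needs no search over subsets. The biconvex problem mentioned in the abstract would additionally involve the latent variables of latent SVM, which this sketch leaves out.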

Similar resources

Structured Output Learning with Candidate Labels for Local Parts

This paper introduces a special setting of weakly supervised structured output learning, where the training data is a set of structured instances and supervision involves candidate labels for some local parts of the structure. We show that the learning problem with this weak supervision setting can be efficiently handled and then propose a large margin formulation. To solve the non-convex optim...

Input Output Kernel Regression: Supervised and Semi-Supervised Structured Output Prediction with Operator-Valued Kernels

In this paper, we introduce a novel approach, called Input Output Kernel Regression (IOKR), for learning mappings between structured inputs and structured outputs. The approach belongs to the family of Output Kernel Regression methods devoted to regression in feature space endowed with some output kernel. In order to take into account structure in input data and benefit from kernels in the inpu...

Semi-supervised structured prediction models

Learning mappings between arbitrary structured input and output variables is a fundamental problem in machine learning. It covers many natural learning tasks and challenges the standard model of learning a mapping from independently drawn instances to a small set of labels. Potential applications include classification with a class taxonomy, named entity recognition, and natural language parsin...

Large Margin Semi-supervised Structured Output Learning

In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semi-supervised structured classification has been developed to handle large amounts of unlabeled structured data. In this work, we consider semi-supervised structural SVMs with domain constraints. The optimization problem, which in general is...

Tractable Semi-supervised Learning of Complex Structured Prediction Models

  • Multi-label classification (e.g., a document belongs to more than one class: finance, politics)
  • Sequence learning (e.g., input: a sentence; output: POS tags), such as "The President came to the office" tagged DT N V P DT N
  • In this paper, we consider general structures
Characteristics:
  • Exponential number of output combinations for a given input (e.g., 2^K in a K-output multi-label classification problem)
  • Labe...

Publication date: 2013